
Write out job metadata to a file in the same directory as refined.cif #132

Merged

k-chrispens merged 2 commits into main from mdc-write-metadata on Mar 5, 2026

Conversation

marcuscollins (Collaborator) commented Mar 5, 2026

Behold, the world's simplest PR. Just write out some metadata so it will be easier to process the results of our occupancy sweeps.

This starts to address #121

Summary by CodeRabbit

  • New Features
    • Guidance jobs now write a job_metadata.json file into each job's output directory after completion.
    • Output directories are now created automatically as needed before jobs run.
  • Bug Fixes
    • Improved reliability of per-job metadata persistence and isolation so results and metadata are consistently saved.
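The feature summarized above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the `Job` fields and the `write_job_metadata` helper are hypothetical stand-ins for whatever job object `run_guidance_job_queue` actually serializes.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass
class Job:
    # Illustrative job parameters; the real job type in the repo differs.
    name: str
    occupancy: float


def write_job_metadata(job: Job, output_dir: Path) -> Path:
    """Write the job's parameters to job_metadata.json in its output directory."""
    # Create the output directory automatically if it does not exist yet.
    output_dir.mkdir(parents=True, exist_ok=True)
    metadata_path = output_dir / "job_metadata.json"
    with metadata_path.open("w", encoding="utf-8") as fp:
        json.dump(asdict(job), fp, indent=2)
    return metadata_path
```

With a metadata file sitting next to each job's refined.cif, post-processing a sweep reduces to globbing for `job_metadata.json` and loading the parameters back with `json.load`.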

coderabbitai (Bot, Contributor) commented Mar 5, 2026

📝 Walkthrough

Added an import for json, ensured the guidance output directory is created when running, and persisted per-job metadata by writing a job_metadata.json file (serialized from the job object) into each job's output directory after executing the job. Removed an inactive commented-out block about wrapper reuse.

Changes

Cohort / File(s): Guidance script updates — src/sampleworks/utils/guidance_script_utils.py
Summary: Added import json; ensure args.output_dir exists in run_guidance; after each job in run_guidance_job_queue, serialize the job object and write job_metadata.json to the job's output directory; removed the commented-out wrapper-reuse block.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐇 I hopped through code and left a mark,
A little JSON glowing in the dark,
Each job recorded, tidy and neat,
A rabbit's whisper in bytes so sweet,
Hop on—our runs are safe and stark. 🌙

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 50.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

  • Description Check — ✅ Passed: check skipped; CodeRabbit’s high-level summary is enabled.
  • Title Check — ✅ Passed: the title accurately describes the main change, writing job metadata to a file in the output directory alongside refined results.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch mdc-write-metadata


coderabbitai (Bot, Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/sampleworks/utils/guidance_script_utils.py (1)

584-584: Remove the redundant inline comment.

Line 584 restates what the code already makes clear; dropping it keeps this file closer to the repo’s “direct, readable” style.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sampleworks/utils/guidance_script_utils.py` at line 584, Remove the
redundant inline comment that repeats the code’s action ("# write out the job
parameters to a JSON file in the same directory as the refined.cif file");
delete that comment line near the code that writes job parameters to JSON (the
block referencing refined.cif/job parameters) so the implementation (in
guidance_script_utils.py) remains direct and uncluttered.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/sampleworks/utils/guidance_script_utils.py`:
- Around line 585-586: Ensure the metadata write is resilient by creating the
parent directory for job_result.output_dir
(Path(job_result.output_dir).mkdir(parents=True, exist_ok=True)) before opening
"job_metadata.json", and protect the json.dump with a try/except so one bad job
doesn't abort the queue: build a serializable copy of job.__dict__ (e.g.,
convert GuidanceConfig.model and GuidanceConfig.guidance_type to simple
serializable values like their class name or str(repr(...))), use json.dump on
that safe dict (or json.dump(..., default=str) to fallback for non-serializable
objects), and on any exception log the error and continue rather than
re-raising; these changes should be applied where run_guidance() writes job
metadata so the queue keeps processing even if serialization or missing
directories occur.
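The resilience pattern this prompt describes can be sketched as below. This is a hedged illustration under stated assumptions: `dump_metadata_safely` and its arguments are hypothetical names, and `default=str` is one simple way to handle non-serializable fields such as enum or model objects.

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def dump_metadata_safely(metadata: dict, output_dir: Path) -> bool:
    """Best-effort metadata write: logs and returns False on failure, never raises."""
    try:
        # Create the parent directory first so the open() cannot fail on a missing path.
        output_dir.mkdir(parents=True, exist_ok=True)
        with (output_dir / "job_metadata.json").open("w", encoding="utf-8") as fp:
            # default=str stringifies values json cannot serialize natively
            # (e.g. enum members or arbitrary objects) instead of raising TypeError.
            json.dump(metadata, fp, indent=2, default=str)
        return True
    except (OSError, TypeError) as exc:
        logger.warning("Failed to write job metadata to %s: %s", output_dir, exc)
        return False
```

Because the helper swallows only I/O and serialization errors and reports success via its return value, a queue loop calling it can keep processing the remaining jobs even when one metadata write fails.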


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2f0036a2-9da3-4d26-8cf3-470d5cd6e216

📥 Commits

Reviewing files that changed from the base of the PR and between 9ef3393 and f8732e9.

📒 Files selected for processing (1)
  • src/sampleworks/utils/guidance_script_utils.py

Comment thread src/sampleworks/utils/guidance_script_utils.py
coderabbitai (Bot, Contributor) left a comment

♻️ Duplicate comments (1)
src/sampleworks/utils/guidance_script_utils.py (1)

584-587: ⚠️ Potential issue | 🟠 Major

Previously flagged issues remain unresolved; additionally, consider including execution metadata.

The directory creation and serialization concerns from the previous review are still valid and unaddressed:

  1. output_dir may not exist if _run_guidance() fails before save_everything() runs
  2. json.dump(job.__dict__, ...) will raise TypeError for non-serializable fields (e.g., GuidanceType, StructurePredictor objects)
  3. Any exception aborts the entire job queue

New observation: Writing only job.__dict__ omits valuable execution results. Per the JobResult dataclass, fields like status, exit_code, runtime_seconds, started_at, and finished_at are only in job_result, not in job. Consider merging both for complete metadata.

Proposed fix incorporating both concerns
-        # write out the job parameters to a JSON file in the same directory as the refined.cif file
-        with open(Path(job_result.output_dir) / "job_metadata.json", "w") as fp:
-            json.dump(job.__dict__, fp)
+        # write out the job parameters and execution results to a JSON file
+        metadata_path = Path(job_result.output_dir) / "job_metadata.json"
+        try:
+            metadata_path.parent.mkdir(parents=True, exist_ok=True)
+            combined_metadata = {
+                "config": job.__dict__,
+                "result": job_result.__dict__,
+            }
+            with metadata_path.open("w", encoding="utf-8") as fp:
+                json.dump(combined_metadata, fp, indent=2, default=str)
+        except (OSError, TypeError) as exc:
+            logger.warning(f"Failed to write job metadata to {metadata_path}: {exc}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sampleworks/utils/guidance_script_utils.py` around lines 584 - 587,
Ensure the job metadata write is robust: before opening
Path(job_result.output_dir) / "job_metadata.json" ensure the directory exists
(create parents with exist_ok=True) and perform the write atomically to avoid
partial files; merge serializable representations of job and job_result (use
job.__dict__ plus JobResult fields like status, exit_code, runtime_seconds,
started_at, finished_at) into a single dict; convert or filter non-serializable
values from GuidanceType/StructurePredictor (e.g., replace with their names/IDs
or call a to_dict() if available) or provide a default JSON serializer to avoid
TypeError when json.dump is called; wrap the file write in a try/except that
logs the exception (reference process/logger used elsewhere) instead of raising
so a single metadata write failure does not abort the entire job queue (locate
code around _run_guidance(), save_everything(), job_result.output_dir, and the
JobResult dataclass to implement these changes).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a49d25b9-96c1-4beb-a640-e61b36e13275

📥 Commits

Reviewing files that changed from the base of the PR and between f8732e9 and 3239abc.

📒 Files selected for processing (1)
  • src/sampleworks/utils/guidance_script_utils.py

coderabbitai (Bot, Contributor) left a comment

♻️ Duplicate comments (1)
src/sampleworks/utils/guidance_script_utils.py (1)

587-590: ⚠️ Potential issue | 🟠 Major

Prevent metadata-write I/O failures from aborting the remaining job queue.

If this write hits an OSError (e.g., permission/disk issues), the loop exits and skips remaining jobs. Since metadata is auxiliary, handle write failures and continue.

Proposed fix
-        # write out the job parameters to a JSON file in the same directory as the refined.cif file
-        with open(Path(job_result.output_dir) / "job_metadata.json", "w") as fp:
-            json.dump(job.__dict__, fp)
+        # write out the job parameters to a JSON file in the same directory as the refined.cif file
+        metadata_path = Path(job_result.output_dir) / "job_metadata.json"
+        metadata_path.parent.mkdir(parents=True, exist_ok=True)
+        try:
+            with metadata_path.open("w", encoding="utf-8") as fp:
+                json.dump(job.__dict__, fp)
+        except OSError as exc:
+            logger.warning(f"Failed to write job metadata to {metadata_path}: {exc}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sampleworks/utils/guidance_script_utils.py` around lines 587 - 590, The
metadata write (opening Path(job_result.output_dir) / "job_metadata.json" and
json.dump(job.__dict__, fp)) can raise OSError and currently aborts the job
loop; wrap the open/json.dump in a try/except that catches OSError (or OSError
and IOError for compatibility), log the error with context including
job_result.output_dir and job identifier (e.g., job.id or other distinguishing
field) using the module's logger, and continue without re-raising so remaining
jobs are processed. Ensure the except only swallows I/O-related exceptions and
does not hide other failures.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fc7d3025-c3e5-4c90-a988-b1b9c9740d04

📥 Commits

Reviewing files that changed from the base of the PR and between 3239abc and d60f807.

📒 Files selected for processing (1)
  • src/sampleworks/utils/guidance_script_utils.py

k-chrispens (Collaborator) left a comment

looks good!

@k-chrispens k-chrispens merged commit b9005c9 into main Mar 5, 2026
1 check passed
@k-chrispens k-chrispens deleted the mdc-write-metadata branch March 5, 2026 19:21